Meta-Gradient Boosted Decision Tree Model for Weight and Target Learning

نویسندگان

  • Yury Ustinovsky
  • Valentina Fedorova
  • Gleb Gusev
  • Pavel Serdyukov
چکیده

Labeled training data is an essential part of any supervised machine learning framework. In practice, there is a trade-off between the quality of a label and its cost. In this paper, we consider a problem of learning to rank on a large-scale dataset with low-quality relevance labels aiming at maximizing the quality of a trained ranker on a small validation dataset with high-quality ground truth relevance labels. Motivated by the classical Gauss-Markov theorem for the linear regression problem, we formulate the problems of (1) reweighting training instances and (2) remapping learning targets. We propose meta– gradient decision tree learning framework for optimizing weight and target functions by applying gradient-based hyperparameter optimization. Experiments on a large-scale real-world dataset demonstrate that we can significantly improve state-of-the-art machine-learning algorithms by incorporating our framework.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluating Hospital Case Cost Prediction Models Using Azure Machine Learning Studio

Ability for accurate hospital case cost modelling and prediction is critical for efficient health care financial management and budgetary planning. A variety of regression machine learning algorithms are known to be effective for health care cost predictions. The purpose of this experiment was to build an Azure Machine Learning Studio tool for rapid assessment of multiple types of regression mo...

متن کامل

Comparing ensembles of decision trees and neural networks for one-day-ahead streamflow prediction

Ensemble learning methods have received remarkable attention in the recent years and led to considerable advancement in the performance of the regression and classification problems. Bagging and boosting are among the most popular ensemble learning techniques proposed to reduce the prediction error of learning machines. In this study, bagging and gradient boosting algorithms are incorporated in...

متن کامل

Urban Link Travel Time Prediction Based on a Gradient Boosting Method Considering Spatiotemporal Correlations

The prediction of travel times is challenging because of the sparseness of real-time traffic data and the intrinsic uncertainty of travel on congested urban road networks. We propose a new gradient–boosted regression tree method to accurately predict travel times. This model accounts for spatiotemporal correlations extracted from historical and real-time traffic data for adjacent and target lin...

متن کامل

Incorporating Boosted Regression Trees into Ecological Latent Variable Models

Important ecological phenomena are often observed indirectly. Consequently, probabilistic latent variable models provide an important tool, because they can include explicit models of the ecological phenomenon of interest and the process by which it is observed. However, existing latent variable methods rely on handformulated parametric models, which are expensive to design and require extensiv...

متن کامل

Optimization with Gradient-Boosted Trees and Risk Control

Decision trees effectively represent the sparse, high dimensional and noisy nature of chemical data from experiments. Having learned a function from this data, we may want to thereafter optimize the function, e.g., picking the best chemical process catalyst. In this way, we may repurpose legacy predictive models. This work studies a large-scale, industrially-relevant mixed-integer quadratic opt...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016